Characterisation of scientific and popular science discourse in French, Japanese and Russian
Abstract
We aim to characterise the comparability of corpora, addressing this issue in a trilingual context through the distinction between expert and non-expert documents. We work separately with corpora composed of documents from the medical domain in three languages (French, Japanese and Russian) that are linguistically distant from one another. In our approach, documents are characterised in each language by their topic and by a discursive typology positioned at three levels of document analysis: structural, modal and lexical. The document typology is implemented with two learning algorithms (SVMlight and C4.5). Evaluation of the results shows that the proposed discursive typology can be transposed from one language to another, as it does make it possible to distinguish the two targeted discourses (science and popular science). However, we observe that performance varies considerably across languages, algorithms and types of discursive
Similar resources
Characterization of Scientific and Popular Science Discourse in French, Japanese and Russian
We aim to characterize the comparability of corpora, addressing this issue in a trilingual context through the distinction between expert and non-expert documents. We work separately with corpora composed of documents from the medical domain in three languages (French, Japanese and Russian) that are linguistically distant from one another. In our approach, documents are characterized in...
Language Features of Russian Texts of Engineering Discourse
The article is devoted to the applied problem of identifying the linguistic features of engineering texts. The study of Russian-language texts of engineering discourse is usually of an applied nature; in our case, this applied research is motivated by the need to teach foreigners who receive professional engineering education in Russia and in the Russian language. The object of the research is the Rus...
Compilation of Specialized Comparable Corpora in French and Japanese
In this paper we present the development of a tool for compiling specialized comparable corpora whose quality would be close to that of a manually compiled corpus. Comparability is based on three levels: domain, topic and type of discourse. Domain and topic can be filtered using the keywords of a web search, but detecting the type of discourse requires a broad linguistic analysis. The ...
Testing Problems in Russian as a Foreign Language in a Technical University
Problems in the theory and practice of testing Russian as a foreign language for entrants to technical universities are considered. The benefits of test formats for assessing foreign students’ Russian language skills under a strict time limit are presented. The structure and content of the tests, all types of tasks offered on the entrance and final examinations in the Russian languag...
Metadiscourse Use in Popular and Professional Science: The Case of Hedges and Boosters
The present article shows that all scientific texts included in journals, magazines, and newspapers are vulnerable to the penetration of hedges and boosters. However, it was found that scientific texts in the three corpora tended to open up the possibilities of alternative voices rather than narrowing them down. The relatively higher frequency of occurrence of hedges in comparison with booster...